Abstract



In this paper, we are interested in the application to video segmentation of
the discrete shape optimization problem

$$\min_{\theta} \; \lambda J(\theta) + \sum_i (\alpha - f_i)\,\theta_i \qquad (1)$$

incorporating a data term $f = (f_i)$ and a total variation function $J$, where the
unknown $\theta = (\theta_i)$ with $\theta_i \in \{0, 1\}$ is a binary function representing the region to
be segmented and $\alpha$ is a parameter. Based on recent work and on the work of Darbon and
Sigelle [14, 15], we justify the equivalence of the shape optimization problem and
a weighted TV regularization in the case where $J$ is a "weighted" total variation.
To solve this problem, we adapt the projection algorithm proposed in [10] to
this case. Another way of solving it, also investigated here, is to use graph cuts. Both
methods have the advantage of leading to a global minimum.

Since moving objects can be distinguished from the static elements of a scene by
analyzing the norm of the optical flow vectors, we choose $f$ as the optical flow norm.
In order to have the contour as close as possible to an edge in the image, we use
a classical edge detector function as the weight of the weighted total variation.
This model has been used in former work. We also apply the same methods
to a video segmentation model used by Jehan-Besson, Barlaud and Aubert.
In this case, it is a direct but interesting application, as only the standard
perimeter is incorporated in the shape functional. We also propose another way
of finding moving objects, by using an a contrario detection of objects on the image
obtained by solving the Rudin-Osher-Fatemi Total Variation regularization
problem. We can notice that, in the former methods, the segmentation can be
associated with a level set.

Keywords : total variation, motion detection, active contour models. 

1 Introduction 

Segmentation of moving objects from a video sequence is an important task whose 
applications cover domains such as video compression, video surveillance or object
recognition. In video compression, the MPEG-4 video coding standard is based on the 
representation of the scene as different shapes-objects. This representation simplifies 
the scene and is used for the encoding of the sequence. 

There are different ways to perform moving object segmentation, using different
mathematical techniques. For Markov Random Field based methods, we refer to the works
of Bouthemy [5], and for maximum likelihood based methods, to the works of
Deriche and Paragios [17]. For variational techniques, we refer to the works of
Deriche et al. [3] and Barlaud et al. [2]. Finally, mathematical morphology has been
used more and more over the last ten years; see the works of Salembier, Serra and their
teams.

In this paper, based on the former work [31] concerning moving object segmentation,
we focus on two different techniques. The first one relies on the recent result
of [11] (the same results were derived independently, and previously, by Darbon and
Sigelle in a probabilistic setting); the second one is the use of graph cuts
(Boykov, Veksler and Zabih [8], Kolmogorov and Zabih [25]).

The result of [11] states that solving the Rudin-Osher-Fatemi Total Variation
regularization problem [32] and thresholding the result at the level $\alpha$ gives a region that is
a solution of the shape optimization problem. The idea of the proof relies on the fact
that the total variation of a function can be reconstructed from the perimeters of its
level sets: this is the well-known coarea formula. Earlier works also rely on the coarea
formula: some authors use it to derive a new scheme for TV diffusion
and improve its efficiency using a level set decomposition of the image; Chan,
Esedoglu and Nikolova solve a Mumford-Shah/Chan-Vese ([28], [12]) problem
with fixed means by a TV regularization, and also state an equivalence result between
a special shape optimization problem and a TV regularization problem with an $L^1$ norm
data fidelity term.

In this paper, we use the framework of [11] in the case of a non-homogeneous total
variation functional, corresponding to a weighted anisotropic perimeter.
The outline is as follows. We first present the energy
used to segment moving objects in the image and give formal
mathematical arguments for the use of TV regularization. This is followed by a
mathematical part on TV regularization and on its equivalence with a
class of shape optimization problems, and by a part where we present graph cuts and
their use for our functional. An experimental part then shows the
results obtained. The last part is dedicated to an automatic moving object detection
performed by a contrario statistical methods on the result obtained by total variation
regularization, which we compare to the previously shown methods.

2 A shape optimization problem for moving object detection

2.1 The functional 

Once we have determined the optical flow $v$, we keep it for the segmentation purpose. We
denote by $\Omega$ the moving region and by $D$ the image domain. As a moving object should
be characterised by a sufficiently large flow magnitude, it seems natural to incorporate
$\int_\Omega (\alpha - |v|(x))\,dx$ into the energy we want to minimize, where $\alpha - |v|(x)$ has to take
different signs on the image domain; otherwise the solution of the shape optimization
problem is trivial. As we want the boundary of $\Omega$ to remain stable in the presence
of noise or spurious variations, we also penalize the total length of this boundary (that
is, the perimeter of $\Omega$) in our functional. Finally, as thresholding the optical flow will
not give exact object contours (due to the temporal integration), we add a weighted
perimeter which integrates a function of the image gradient (here $g_I = \frac{1}{1+|\nabla I|^2}$) along the
boundary. Since $\int_\Omega (\alpha - |v|)\,dx = \int_\Omega \alpha\,dx + \int_{D\setminus\Omega} |v|\,dx - \int_D |v|\,dx$ and the last
term does not depend on $\Omega$, this gives the functional

$$\int_\Omega \alpha\,dx + \int_{D\setminus\Omega} |v|\,dx + \lambda \int_{\partial\Omega} g_I(x)\,dS + \mu \int_{\partial\Omega} dS \qquad (2)$$

where $dS$ denotes the arclength element along the boundary. To simplify the notation,
we denote $\lambda g_I + \mu$ by $g$. Finally, our functional is

$$\int_\Omega \alpha\,dx + \int_{D\setminus\Omega} |v|\,dx + \int_{\partial\Omega} g(x)\,dS \qquad (3)$$

Within the framework of shape sensitivity analysis (see Murat and Simon [22], Delfour
and Zolesio [16]), one can compute the shape derivative of this functional and obtain
the steepest descent direction. Combining it with the well-known level set method (Osher and
Sethian [30]), we would obtain

$$\frac{\partial u}{\partial t} = |\nabla u| \left( |v| - \alpha + \operatorname{div}\left( g\,\frac{\nabla u}{|\nabla u|} \right) \right).$$

Another similar method is to use $u$ as the unknown of the functional instead of $\Omega$: the
integral over $\Omega$ (resp. $D \setminus \Omega$) is replaced by an integral over $D$ with the weight $H_\varepsilon(u)$
(resp. $1 - H_\varepsilon(u)$), and the boundary term by the integral over $D$ with the weight
$|\nabla(H_\varepsilon(u))|$. Let us notice that a parameter $\varepsilon$ is needed in this method for computing
$\delta_\varepsilon$ and $H_\varepsilon$, which are $C^\infty$ regularizations of the Dirac and Heaviside distributions. The
resulting PDE, leading to the same curve motion as the previous one, is

$$\frac{\partial u}{\partial t} = \delta_\varepsilon(u) \left( |v| - \alpha + \operatorname{div}\left( g\,\frac{\nabla u}{|\nabla u|} \right) \right).$$

This was done in [31]. Unfortunately, if we want to adjust the value of $\alpha$ in a suitable
way, we have to solve this partial differential equation again as many times
as necessary. We overcome this problem by using the equivalence between solving the
ROF model with a weighted total variation and solving the shape optimization problem
for all possible values of $\alpha$.

In [11], the functionals do not involve the standard perimeter but a different, anisotropic one.
This is for theoretical reasons: the discrete total variation does not
satisfy the coarea formula, which is needed in the main result. In fact, the
theory can be developed with the isotropic total variation in the continuous setting,
and results could still be (approximately) computed.

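Thus we slightly modify the functional to fit in the framework given in [11] ($\nu$ denotes
the outward normal to the boundary, $|\cdot|_1$ the 1-norm, $|(a,b)|_1 = |a| + |b|$, and $R_{\frac{\pi}{4}}$
the rotation of angle $\frac{\pi}{4}$):

$$E(\Omega) = \int_\Omega \alpha\,dx + \int_{D\setminus\Omega} |v|\,dx + \frac{1}{2} \int_{\partial\Omega} g(x) \left( |\nu|_1 + |R_{\frac{\pi}{4}}(\nu)|_1 \right) dS. \qquad (4)$$

This is a change of metric: the standard length and its weighted counterpart are
replaced by what is usually called the "Manhattan" or "taxicab" length. We could keep
only $\int_{\partial\Omega} g(x)\,|\nu|_1\,dS$, but $\frac{1}{2}\int_{\partial\Omega} g(x)(|\nu|_1 + |R_{\frac{\pi}{4}}(\nu)|_1)\,dS$ is useful in order not to overestimate
the length of diagonal linear parts of the boundary of $\Omega$. We introduce the weighted
isotropic and anisotropic total variations

$$TV_g(u) := \int_D g\,|Du| \quad \text{and} \quad TV_{1,\frac{\pi}{4},g}(u) := \frac{1}{2} \int_D g \left( |Du|_1 + |R_{\frac{\pi}{4}}(Du)|_1 \right)$$

(the index 1 refers to the 1-norm of the normal and $g$ to the weight function), so that
$\frac{1}{2}\int_{\partial\Omega} g(x)(|\nu|_1 + |R_{\frac{\pi}{4}}(\nu)|_1)\,dS$ and $\int_{\partial\Omega} g(x)\,dS$ are respectively
the anisotropic weighted perimeter and the weighted perimeter. We denote $A_g(\partial\Omega) =
TV_{1,\frac{\pi}{4},g}(\chi_\Omega)$ and $L_g(\partial\Omega) = TV_g(\chi_\Omega)$; these two perimeters satisfy

$$c_1 L_g(\partial\Omega) \le A_g(\partial\Omega) \le c_2 L_g(\partial\Omega)$$

where one can take $c_1 = \frac{1+\sqrt{2}}{2}$ and $c_2 = \sqrt{2}\cos\frac{\pi}{8}$, and thus if the boundary of $\Omega$ has a finite $L_g$, it has a
finite $A_g$, and conversely.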
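The comparison between the two perimeters can be checked numerically on straight boundaries. The following Python sketch is an illustration only, not part of the method: it samples unit normals $\nu$ and evaluates $\frac{1}{2}(|\nu|_1 + |R_{\frac{\pi}{4}}(\nu)|_1)$, which for a unit vector is directly the ratio between the anisotropic and the Euclidean length element. The function name `anisotropic_length` and the sampling step are assumptions made for the example.

```python
import math

def anisotropic_length(nx, ny):
    """0.5 * (|nu|_1 + |R_{pi/4} nu|_1) for a vector nu = (nx, ny)."""
    c = s = math.sqrt(0.5)                      # cos(pi/4) = sin(pi/4)
    rx, ry = c * nx - s * ny, s * nx + c * ny   # nu rotated by pi/4
    return 0.5 * (abs(nx) + abs(ny) + abs(rx) + abs(ry))

# Sample unit normals every 0.25 degree. Their Euclidean length is 1, so each
# value is the ratio (anisotropic length) / (Euclidean length) for that direction.
ratios = [anisotropic_length(math.cos(math.radians(0.25 * k)),
                             math.sin(math.radians(0.25 * k)))
          for k in range(1440)]
c1, c2 = min(ratios), max(ratios)
print(c1, c2)
```

The smallest ratio is reached for normals aligned with the axes or the diagonals, the largest one halfway between them, which is why the diagonal terms reduce the anisotropy of the plain Manhattan length.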

Finally, we rewrite our functional in the discrete setting, which is the one used in the rest of the
paper.

Let us observe that the weight $g_{i,j}$ could be different on each edge (connecting two
neighboring pixels) of the grid, and that the choice we have made is quite arbitrary.
However, we did not observe a significant change in the output when weighting the
edges in a different way.

2.2 Remarks about the minimization 

As we have seen, a functional like (3) is usually minimised using shape sensitivity
analysis, classical calculus of variations or Heaviside
function techniques (Chan-Vese). All of those are gradient-descent methods. In
[14] and [11], it is shown that the solutions of the discrete shape optimization problem

$$\min_{\theta} \; \lambda J(\theta) + \sum_i (\alpha - f_i)\,\theta_i$$

($i$ is the pixel index, $\theta$ plays the role of the characteristic function
of the shape, $f$ is a data function [in our problem, the optical flow norm] and $J$ is
a total variation, though it could be another function satisfying the same properties,
as described in Section 3) can be obtained by computing the solution $u$ of the
Rudin-Osher-Fatemi total variation regularization problem

$$\min_u \; \frac{1}{2\lambda} \|u - f\|^2 + J(u)$$

and simply thresholding $u$ at the level $\alpha$. This has two advantages over classical
snake methods like the ones cited above. First, it gives a global minimum of the
shape optimization problem, which is not necessarily the case for the classical snake
methods, since the gradient descent may be trapped in local minima. Secondly, if
we want to find the most appropriate value of $\alpha$, we only have to compute the
solution of the ROF problem once and to threshold it at different levels in order to decide which
value we keep; with any other method, we would have to repeat the minimization
as many times as the number of values of $\alpha$ we would like to compare. With the
projection algorithm for computing the solution of the ROF problem (see Section 3.2),
we gain another, slighter advantage: we avoid introducing the additional parameters
that are required to approximate either the total variation, in the usual PDE-based solvers,
or the Dirac and Heaviside functions.

It has been known since Greig, Porteous and Seheult [20] that energies of this type can
be minimized exactly. More recently, Kolmogorov and Zabih [23] proposed the
"graph cuts" algorithm as a way to minimize such energies. We detail
it in Section 4. It leads to a global minimum, but the second advantage
of TV regularization does not hold here: we have to compute the solution of the
shape optimization problem as many times as necessary if we want to optimize the $\alpha$
parameter. As a single graph cut computation requires approximately 0.5 second and
the ROF solution about 1 minute (on an image of size 256 x 256, on a laptop equipped
with a 1.8 GHz Pentium 4 and 1 GB of RAM), graph cuts are better for
a fixed value of $\alpha$, whereas the ROF solution is preferable if we want to try many
different values of $\alpha$ (with these timings, roughly 120 values or more).

3 On the equivalence of total variation regularization and 
a class of shape optimization problems 

In this section, we use the following notations: $|\cdot|$ denotes the Euclidean norm,
$|(a,b)| = \sqrt{a^2 + b^2}$; $|\cdot|_p$ denotes the $p$-norm, $|(a,b)|_p = (|a|^p + |b|^p)^{1/p}$; and $|\cdot|_\infty$ denotes
the $\infty$-norm, $|(a,b)|_\infty = \max(|a|, |b|)$.

3.1 Settings 

In this section, we recall the main results obtained in [11]. The problem considered is



where $X$ is the space of functions defined on the $N$ pixels of the image grid ($i$ denotes
the pixel index and $f$ is still a data function). The function $J : X \to \mathbb{R}_+$ satisfies four
properties:

• convexity: $J(tu + (1 - t)v) \le tJ(u) + (1 - t)J(v)$ for any $u, v \in X$ and $t \in [0, 1]$,

• lower semicontinuity,

• 1-homogeneity: $J(tu) = tJ(u)$ for any $t > 0$ and $u \in X$,

• the generalized coarea formula

$$J(u) = \int_{-\infty}^{+\infty} J(1_{\{u > t\}})\,dt \qquad (5)$$

where $1_{\{u > t\}}$ denotes the indicator function of the upper level set $\{u > t\}$ of $u$.

3.1.1 Main theorem and extensions 

We consider the Rudin-Osher-Fatemi TV regularization problem

$$\min_u \; J(u) + \frac{1}{2\lambda} \|u - f\|^2 \qquad (6)$$

and the discrete shape optimization problem

$$\min_\theta \; \lambda J(\theta) + \sum_i (\alpha - f_i)\,\theta_i. \qquad (7)$$

The main theorem of [11] states an equivalence between solving (6) and thresholding
the result at the threshold $\alpha$, and solving (7). As we are only concerned with solving (7),
we give only the part of the theorem which states that thresholding the solution of
the discretized ROF model gives a solution of the shape optimisation problem.

Theorem 1 ([11]) Let $w$ solve (6). Then, for any $s \in \mathbb{R}$, both $w^s_i = 1_{\{w_i > s\}}$ and
$\bar{w}^s_i = 1_{\{w_i \ge s\}}$ solve (7) with $\alpha = s$. If $w^s = \bar{w}^s$, then the solution of (7) is unique.
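The statement of the theorem can be illustrated on a tiny one-dimensional signal, where the shape optimization problem can be solved by exhaustive search. The sketch below is an illustration under stated assumptions, not the algorithm of the paper: $J$ is taken as the 1-D total variation, the ROF problem is solved approximately by a plain projected gradient ascent on the dual variable, and the data `f`, the parameters and all function names are hypothetical.

```python
from itertools import product

def tv(u):
    """1-D discrete total variation: sum of |u[i+1] - u[i]|."""
    return sum(abs(b - a) for a, b in zip(u, u[1:]))

def rof_1d(f, lam, n_iter=2000):
    """Approximate argmin_w lam*TV(w) + 0.5*||w - f||^2 by projected gradient
    ascent on the dual variable p (one value per edge, constrained to |p| <= 1)."""
    n = len(f)
    p = [0.0] * (n - 1)
    w = list(f)
    for _ in range(n_iter):
        # gradient step on p, then projection onto [-1, 1]
        p = [max(-1.0, min(1.0, p[i] + 0.25 * (w[i + 1] - w[i]) / lam))
             for i in range(n - 1)]
        # primal variable w = f - lam * D^T p, where (D w)_i = w[i+1] - w[i]
        w = list(f)
        for i in range(n - 1):
            w[i] += lam * p[i]
            w[i + 1] -= lam * p[i]
    return w

def shape_energy(theta, f, lam, alpha):
    """lam * J(theta) + sum_i (alpha - f_i) * theta_i."""
    return lam * tv(theta) + sum((alpha - fi) * ti for fi, ti in zip(f, theta))

f = [0.1, 0.0, 0.2, 0.9, 1.0, 0.8, 0.1, 0.0]   # hypothetical data
lam, alpha = 0.1, 0.5

w = rof_1d(f, lam)
theta_rof = [1 if wi > alpha else 0 for wi in w]          # threshold at alpha

# exhaustive search over the 2^8 binary labellings
theta_best = min(product([0, 1], repeat=len(f)),
                 key=lambda th: shape_energy(th, f, lam, alpha))
print(theta_rof, list(theta_best))
```

Minimizing $\lambda J(w) + \frac{1}{2}\|w - f\|^2$ is the same problem as (6) multiplied by $\lambda$. Once `w` is known, other values of `alpha` only require a new thresholding, whereas the exhaustive search (or any direct solver) has to be run again.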

In [11], it is the discrete Manhattan total variation that is used,

$$J(u) = \sum_{i,j} |u_{i+1,j} - u_{i,j}| + |u_{i,j+1} - u_{i,j}|,$$

which is discretized from the continuous 1-TV introduced in the previous section. If
we want a more isotropic, $\frac{\pi}{4}$-rotationally invariant Manhattan TV, we may take
diagonal terms into account:

$$\frac{1}{2} \sum_{i,j} \left( |u_{i+1,j} - u_{i,j}| + |u_{i,j+1} - u_{i,j}| \right) + \frac{1}{2\sqrt{2}} \sum_{i,j} \left( |u_{i+1,j+1} - u_{i,j}| + |u_{i-1,j+1} - u_{i,j}| \right),$$

which is discretized from $\frac{1}{2}\int_D |\nabla u|_1 + \frac{1}{2}\int_D \left( |\nabla u \cdot e_1| + |\nabla u \cdot e_2| \right)$, where $e_1 = \frac{1}{\sqrt{2}}(1, 1)$ and
$e_2 = e_1^\perp$. The second term can be seen as a Manhattan TV in another basis; actually
it is exactly $\frac{1}{2}\int_D |R_{\frac{\pi}{4}}(\nabla u)|_1$, where $R_{\frac{\pi}{4}}$ is the rotation of angle $\frac{\pi}{4}$. The discrete standard
TV

$$TV(u) = \sum_{i,j} \sqrt{|u_{i+1,j} - u_{i,j}|^2 + |u_{i,j+1} - u_{i,j}|^2}$$

does not fit in the framework described here, since it does not satisfy the generalized coarea
formula, although it is the discretized version of the total variation in the standard
definition given in the previous section.
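The difference between the two discretizations with respect to the coarea formula can be seen on a 2 x 2 example. The following sketch is only an illustration: it uses forward differences set to zero at the border (an assumed boundary convention), and for a nonnegative integer-valued image it replaces the integral over the levels $t$ by a sum over integer levels. The function names are hypothetical.

```python
import math

def forward_diffs(u):
    """Yield the forward differences (d_i, d_j) at every pixel, 0 at the border."""
    n, m = len(u), len(u[0])
    for i in range(n):
        for j in range(m):
            di = u[i + 1][j] - u[i][j] if i + 1 < n else 0
            dj = u[i][j + 1] - u[i][j] if j + 1 < m else 0
            yield di, dj

def tv_manhattan(u):
    """Discrete Manhattan TV: sum of |d_i| + |d_j|."""
    return sum(abs(di) + abs(dj) for di, dj in forward_diffs(u))

def tv_isotropic(u):
    """Discrete standard TV: sum of sqrt(d_i^2 + d_j^2)."""
    return sum(math.hypot(di, dj) for di, dj in forward_diffs(u))

def coarea_sum(tv, u):
    """Sum of tv(1_{u > t}) over the integer levels t = 0, ..., max(u) - 1.
    For a nonnegative integer-valued u this equals the integral over t."""
    top = max(max(row) for row in u)
    return sum(tv([[1 if v > t else 0 for v in row] for row in u])
               for t in range(top))

u = [[0, 1],
     [2, 2]]

print(tv_manhattan(u), coarea_sum(tv_manhattan, u))   # equal
print(tv_isotropic(u), coarea_sum(tv_isotropic, u))   # different
```

For the Manhattan TV the identity holds term by term, because $|a - b| = \sum_t |1_{\{a > t\}} - 1_{\{b > t\}}|$ for integers $a, b$; the square root in the standard TV couples the two differences and breaks it.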

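Since Theorem 1 is stated for any function $J$ satisfying the four conditions given
above, and the Manhattan discrete TV satisfies them, it is straightforward to extend it
to a $g$-weighted Manhattan TV

$$\sum_{i,j} g_{i,j} \left( |u_{i+1,j} - u_{i,j}| + |u_{i,j+1} - u_{i,j}| \right)$$

and then to the more isotropic

$$TV_{1,\frac{\pi}{4},g}(u) = \frac{1}{2} \sum_{i,j} g_{i,j} \left( |u_{i+1,j} - u_{i,j}| + |u_{i,j+1} - u_{i,j}| \right) + \frac{1}{2\sqrt{2}} \sum_{i,j} g_{i,j} \left( |u_{i+1,j+1} - u_{i,j}| + |u_{i-1,j+1} - u_{i,j}| \right) \qquad (8)$$

which is the one we are concerned with in this paper.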

3.2 The projection algorithm of [10]

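In [10], a new algorithm for computing the solution of (6) was proposed. It is based
on duality results and consists in finding the projection of $f$ onto a convex set. Let us
describe how it works on the energy we are interested in. Here we follow the same calculus,
which generalizes well to the $g$-weighted Manhattan TV.
The energy considered is thus

$$TV_{1,\frac{\pi}{4},g}(u) = \frac{1}{2} \sum_{i,j} g_{i,j} \left( |(\nabla^x u)_{i,j}| + |(\nabla^y u)_{i,j}| \right) + \frac{1}{2} \sum_{i,j} g_{i,j} \left( |(\nabla^{xy} u)_{i,j}| + |(\nabla^{yx} u)_{i,j}| \right)$$

where we have rewritten the expression (8), the factor $\frac{1}{\sqrt{2}}$ being included in the diagonal
differences $\nabla^{xy}$ and $\nabla^{yx}$. From now on, we denote $\nabla u = (\nabla^x u, \nabla^y u)$
and $\nabla' u = (\nabla^{xy} u, \nabla^{yx} u)$.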

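From the discrete gradients, we get the definition of the discrete divergence $\operatorname{div} = -\nabla^*$,

$$\langle \operatorname{div} \xi, w \rangle_X = -\langle \xi, \nabla w \rangle_{X \times X}, \quad \forall w \in X,\ \xi \in X \times X,$$

and similarly for its rotated counterpart $\operatorname{div}' = -(\nabla')^*$,

$$\langle \operatorname{div}' \xi, w \rangle_X = -\langle \xi, \nabla' w \rangle_{X \times X}, \quad \forall w \in X,\ \xi \in X \times X.$$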
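The adjoint relation defining the rotated divergence can be checked numerically. The sketch below is an illustration under assumptions that are not taken from the text: the diagonal differences are $(u_{i+1,j+1} - u_{i,j})/\sqrt{2}$ and $(u_{i-1,j+1} - u_{i,j})/\sqrt{2}$, they are set to zero where the neighbor falls outside the grid, and the names `grad_diag` and `div_diag` are hypothetical.

```python
import math

S = 1.0 / math.sqrt(2.0)

def grad_diag(u):
    """Diagonal differences (grad^xy u, grad^yx u), zero where undefined."""
    n, m = len(u), len(u[0])
    gxy = [[0.0] * m for _ in range(n)]
    gyx = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            if i + 1 < n and j + 1 < m:
                gxy[i][j] = S * (u[i + 1][j + 1] - u[i][j])
            if i - 1 >= 0 and j + 1 < m:
                gyx[i][j] = S * (u[i - 1][j + 1] - u[i][j])
    return gxy, gyx

def div_diag(f1, f2):
    """Negative adjoint of grad_diag, built by scattering each difference."""
    n, m = len(f1), len(f1[0])
    d = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            if i + 1 < n and j + 1 < m:
                d[i][j] += S * f1[i][j]
                d[i + 1][j + 1] -= S * f1[i][j]
            if i - 1 >= 0 and j + 1 < m:
                d[i][j] += S * f2[i][j]
                d[i - 1][j + 1] -= S * f2[i][j]
    return d

# Deterministic test arrays on a 4 x 4 grid (arbitrary values).
u = [[(3 * i + j * j) % 5 for j in range(4)] for i in range(4)]
f1 = [[(i + 2 * j) % 3 for j in range(4)] for i in range(4)]
f2 = [[(2 * i + j) % 4 for j in range(4)] for i in range(4)]

gxy, gyx = grad_diag(u)
d = div_diag(f1, f2)
lhs = sum(gxy[i][j] * f1[i][j] + gyx[i][j] * f2[i][j]
          for i in range(4) for j in range(4))        # <grad' u, f>
rhs = sum(u[i][j] * d[i][j] for i in range(4) for j in range(4))  # <u, div' f>
adjoint_gap = lhs + rhs                                # should vanish
print(adjoint_gap)
```

With these conventions, the value at an interior pixel reduces to $\frac{1}{\sqrt{2}}(f^1_{i,j} - f^1_{i-1,j-1} + f^2_{i,j} - f^2_{i+1,j-1})$; other index conventions for the diagonal differences shift the neighbors accordingly.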

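In [10], it is stated that the solution of

$$\min_{w \in X} \sum_{i,j} |(\nabla w)_{i,j}| + \frac{1}{2\lambda} \|w - w_0\|^2$$

(where $|(\nabla w)_{i,j}|$ is the Euclidean norm of $(\nabla w)_{i,j}$) is given by $w = w_0 - \lambda \operatorname{div} \xi$, where
$\xi$ is a solution to

$$\min \{ \|\lambda \operatorname{div} \xi - w_0\|^2 : \xi \in X \times X,\ |\xi_{i,j}| \le 1 \}.$$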

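Moreover, one has $\xi_{i,j} \cdot (\nabla w)_{i,j} = |\nabla w|_{i,j}$ for all $(i, j)$. As this duality problem relies
on the property

$$\xi \cdot \nabla w \le |\xi|_p\,|\nabla w|_q$$

with $\frac{1}{p} + \frac{1}{q} = 1$ and $p \in [1, +\infty]$ (for $p = \infty$, $q = 1$, and conversely), and as we have
$q = 1$ for the Manhattan TV, the constraint $|\xi_{i,j}| \le 1$ is replaced by $|\xi_{i,j}|_\infty \le 1$, that is to
say $|\xi^x_{i,j}| \le 1$ and $|\xi^y_{i,j}| \le 1$. For the $g$-"weighted" Manhattan TV, as we want to realize

$$\xi_{i,j} \cdot (\nabla w)_{i,j} \le |\xi_{i,j}|_\infty\,|(\nabla w)_{i,j}|_1 \le g_{i,j}\,|(\nabla w)_{i,j}|_1,$$

the constraints become $|\xi^x_{i,j}| \le g_{i,j}$ and $|\xi^y_{i,j}| \le g_{i,j}$. If we consider the full $TV_{1,\frac{\pi}{4},g}$,
we also have the part of the Manhattan TV expressed in the basis $(e_1, e_2)$. This leads to
another vector field $\eta$ which satisfies the same properties as $\xi$. All the constraints can
equivalently be renormalized by the function $g$: we replace $\operatorname{div}(\xi)$ by $\operatorname{div}(g\xi)$ and
$|\xi| \le g$ by $|\xi| \le 1$, and similarly for $\eta$. Let us introduce the compact set (the
overline denotes the closure)

$$K = \overline{\{ \operatorname{div}(g\xi) + \operatorname{div}'(g\eta) \mid (\xi, \eta) \in A^2 \}}$$

where

$$A = \{ p = (p^x, p^y) \in X \times X,\ |p^x_{i,j}| \le 1,\ |p^y_{i,j}| \le 1 \}.$$

From the definition of the total variation $TV_{1,\frac{\pi}{4},g}$, we get

$$TV_{1,\frac{\pi}{4},g}(w) = \frac{1}{2} \sup_{|\xi|_\infty \le 1} \langle w, \operatorname{div}(g\xi) \rangle_X + \frac{1}{2} \sup_{|\eta|_\infty \le 1} \langle w, \operatorname{div}'(g\eta) \rangle_X = \frac{1}{2} \sup_{v \in K} \langle w, v \rangle_X.$$

Exactly in the same manner as in [10], it can be established that the solution of
the ROF problem is obtained from the orthogonal projection of $w_0$ onto $\frac{\lambda}{2} K$; the factor $\frac{1}{2}$ is
kept outside $K$ to keep the constraints simple. Finally, the solution of

$$\min_{w \in X} \; TV_{1,\frac{\pi}{4},g}(w) + \frac{1}{2\lambda} \|w - w_0\|^2 \qquad (9)$$

is given by $w = w_0 - \frac{1}{2} \left( \lambda \operatorname{div}(g\bar{\xi}) + \lambda \operatorname{div}'(g\bar{\eta}) \right)$, where $(\bar{\xi}, \bar{\eta})$ is a solution to

$$\min_{(\xi, \eta) \in A^2} \left\| \frac{1}{2} \left( \lambda \operatorname{div}(g\xi) + \lambda \operatorname{div}'(g\eta) \right) - w_0 \right\|^2. \qquad (10)$$

Let us mention that the $\operatorname{div}'$ operator is different from the $\operatorname{div}$ one, as it is the conjugate
of the gradient in the basis $(e_1, e_2)$. With the diagonal differences of (8), it is simply given by
(denoting $f = (f^1, f^2)$)

$$(\operatorname{div}' f)_{i,j} = \frac{1}{\sqrt{2}} \left( f^1_{i,j} - f^1_{i-1,j-1} + f^2_{i,j} - f^2_{i+1,j-1} \right).$$

The Karush-Kuhn-Tucker conditions yield the existence of Lagrange multipliers $\alpha^1_{i,j} \ge 0$,
$\alpha^2_{i,j} \ge 0$, $\beta^1_{i,j} \ge 0$, $\beta^2_{i,j} \ge 0$ associated to the constraints in (10), which are $(\xi^x_{i,j})^2 \le 1$,
$(\xi^y_{i,j})^2 \le 1$, $(\eta^x_{i,j})^2 \le 1$, $(\eta^y_{i,j})^2 \le 1$. Writing $z = \frac{1}{2} \left( \lambda \operatorname{div}(g\xi) + \lambda \operatorname{div}'(g\eta) \right) - w_0$,
these Lagrange multipliers satisfy

$$-\lambda g_{i,j} (\nabla z)_{i,j} + 2 \left( \alpha^1_{i,j} \xi^x_{i,j},\ \alpha^2_{i,j} \xi^y_{i,j} \right)^T = 0,$$

$$-\lambda g_{i,j} (\nabla' z)_{i,j} + 2 \left( \beta^1_{i,j} \eta^x_{i,j},\ \beta^2_{i,j} \eta^y_{i,j} \right)^T = 0,$$

with either $\alpha^1_{i,j} > 0$ and $|\xi^x_{i,j}| = 1$, or $|\xi^x_{i,j}| < 1$ and $\alpha^1_{i,j} = 0$ (and similarly for $\alpha^2$, $\beta^1$ and $\beta^2$). Thus

$$\alpha^1_{i,j} = \tfrac{1}{2} \lambda g_{i,j} |(\nabla^x z)_{i,j}|, \quad \alpha^2_{i,j} = \tfrac{1}{2} \lambda g_{i,j} |(\nabla^y z)_{i,j}|,$$

$$\beta^1_{i,j} = \tfrac{1}{2} \lambda g_{i,j} |(\nabla^{xy} z)_{i,j}|, \quad \beta^2_{i,j} = \tfrac{1}{2} \lambda g_{i,j} |(\nabla^{yx} z)_{i,j}|.$$

Then, we obtain a fixed-point algorithm similar to the one proposed in [10], with

$$w^n = w_0 - \frac{1}{2} \left( \lambda \operatorname{div}(g\xi^n) + \lambda \operatorname{div}'(g\eta^n) \right).$$

Following the convergence proof of [10], we obtain the following convergence theorem.

Theorem 2 Let $\tau \le \frac{1}{8 \max_{i,j} g_{i,j}}$. Then $\frac{1}{2} \left( \lambda \operatorname{div}(g\xi^n) + \lambda \operatorname{div}'(g\eta^n) \right)$ converges to the
orthogonal projection of $w_0$ onto the convex set $\frac{\lambda}{2} K$ as $n \to \infty$, and $w^n$ converges to the
solution of (9).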

4 How to minimize the energies with graphcuts 

4.1 Principle 

Greig, Porteous and Seheult proved in [20] that discrete energy minimization can be
performed exactly. Graph cuts were introduced in computer vision by Y. Boykov
and his collaborators in [8] as an algorithm for this type of minimization. They have
been extended to many areas: stereovision, medical imaging [7], etc. The idea is to
add a "source" and a "sink" in such a way that each point of the image grid is
linked to either the source or the sink. A cost is assigned to the links so that the
global cost is related to the energy. Finally, solving the energy minimization problem
is equivalent to finding a cut of minimal cost in the graph (source-points-sink). This
is achieved by finding a "maximal flow" along the edges of the graph, thanks to the duality
between the min-cut and max-flow problems, first observed by Ford and Fulkerson.

4.2 Construction 

We recall that the energy is the discrete functional of Section 2.1 (we replace $\lambda + \mu g$ by $g$ for simplicity).
The coefficients $w_{x,y}$ are given by

$$w((i,j), (i \pm 1, j)) = w((i,j), (i, j \pm 1)) = g_{i,j}$$

and

$$w((i,j), (i \pm 1, j \pm 1)) = w((i,j), (i \mp 1, j \pm 1)) = \frac{1}{\sqrt{2}}\,g_{i,j},$$

where, with simpler notations, a pixel is denoted by $x = (i, j)$.

One can see that the weights $w_{x,y}$ are nonsymmetric, $w_{x,y} \ne w_{y,x}$, due to the presence
of $g$, which depends on the pixel. However, there is no particular
problem in introducing nonsymmetric weights, as Kolmogorov and Zabih have shown in
[25] that graph cuts can handle energies involving an interaction term which satisfies

$$E_{inter}(0, 0) + E_{inter}(1, 1) \le E_{inter}(0, 1) + E_{inter}(1, 0).$$

Then, we build the graph $\mathcal{G} = (\mathcal{V}, \mathcal{E})$ made of the vertices $\mathcal{V} = \{i,\ i = 1, \dots, N\} \cup \{t\} \cup \{s\}$
and whose edges are

$$\mathcal{E} = \{(x, y) \mid w_{x,y} > 0\} \cup \{(s, x) \mid 1 \le x \le N\} \cup \{(x, t) \mid 1 \le x \le N\}.$$

As a cut of this graph defines a partition $(\mathcal{V}_s, \mathcal{V}_t)$ of the vertices into two sets, the first
one containing the source and the second one containing the sink, the global cost of a
cut is given by

$$E(\mathcal{V}_s, \mathcal{V}_t) = \sum_{a \in \mathcal{V}_s,\ b \in \mathcal{V}_t} w_{a,b}.$$

What we would like to realize is $E(\mathcal{V}_s, \mathcal{V}_t) = J(\theta)$. The construction is given by
Kolmogorov in [23]. It consists in assigning the weight $w_{x,y}$ to an edge $e = (x, y) \in \mathcal{E}$
of the image grid, the weight $\alpha + \max_i G_i$ to the edges $(s, x)$ and $\max_i G_i - G_x$ to the
edges $(x, t)$; then the equality between the global cost and the energy holds.
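The principle of the construction can be illustrated on a one-dimensional chain of pixels. The sketch below does not reproduce the exact terminal weights above, nor the maxflow library used in the experiments: it uses a common equivalent convention (each pixel is linked to the source or to the sink according to the sign of $\alpha - f_x$), symmetric neighbor weights $\lambda$, and a small Edmonds-Karp max-flow written for the example. All names and data are hypothetical.

```python
from collections import deque
from itertools import product

def max_flow_min_cut(cap, s, t):
    """Edmonds-Karp. Returns (max-flow value, source side of a minimum cut)."""
    res = {u: dict(nbrs) for u, nbrs in cap.items()}    # residual capacities
    for u, nbrs in cap.items():
        for v in nbrs:
            res.setdefault(v, {}).setdefault(u, 0)       # reverse edges
    flow = 0
    while True:
        parent = {s: None}                               # BFS tree in the residual graph
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v, c in res[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:                              # no augmenting path is left
            return flow, set(parent)
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        b = min(res[u][v] for u, v in path)              # bottleneck capacity
        for u, v in path:
            res[u][v] -= b
            res[v][u] += b
        flow += b

def build_graph(f, alpha, lam):
    """Chain graph; theta_i = 1 iff pixel i ends up on the source side."""
    n = len(f)
    cap = {"s": {}, "t": {}}
    for i in range(n):
        cap[i] = {}
    offset = 0
    for i in range(n):
        c = alpha - f[i]               # unary cost paid when theta_i = 1
        if c >= 0:
            cap[i]["t"] = c            # cut when pixel i is on the source side
        else:
            cap["s"][i] = -c           # cut when pixel i is on the sink side
            offset += c                # constant such that cut + offset = energy
    for i in range(n - 1):             # lam * |theta_{i+1} - theta_i|
        cap[i][i + 1] = lam
        cap[i + 1][i] = lam
    return cap, offset

def energy(theta, f, alpha, lam):
    jump = sum(abs(b - a) for a, b in zip(theta, theta[1:]))
    return lam * jump + sum((alpha - fi) * ti for fi, ti in zip(f, theta))

f, alpha, lam = [1, 0, 2, 9, 10, 8, 1, 0], 5, 1          # integer data, exact arithmetic
cap, offset = build_graph(f, alpha, lam)
flow, source_side = max_flow_min_cut(cap, "s", "t")
theta = [1 if i in source_side else 0 for i in range(len(f))]
brute = min(energy(th, f, alpha, lam) for th in product([0, 1], repeat=len(f)))
print(theta, flow + offset, brute)
```

The value of the minimum cut, corrected by the constant `offset`, coincides with the minimum of the energy found by exhaustive search, which is the equality between global cost and energy that the construction is designed to achieve.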



5 Experimental results 

All the experiments whose results are presented here were performed on a laptop
equipped with a 1.8 GHz Pentium 4 and 1 GB of RAM.

5.1 Experiments with optical flow 

For the implementation, we have used the maxflow-v2.1 and energy-v2.1 graph cuts
implementations of V. Kolmogorov, available at http://www.cs.Cornell.edu/People/vnk/software.
The type of the capacities has been set to double, though short or int leads to faster
computation when quantized quantities are given as input.

The optical flow is computed by the Weickert and Schnorr method with a
multiresolution procedure. As optical flow computation has been improved since
the Weickert and Schnorr spatiotemporal model (using mixed models combining local
and global information, using intensity or gradient intensity, etc.), we emphasize that our
purpose is not to obtain a very precise estimation of the optical flow, but to show how
we can improve the segmentation with the $g$-weighted term and thus obtain a segmentation as
close as possible to the image edges. Figure 1 shows the results obtained successively with
$TV_{1,g}$, $TV_{1,\frac{\pi}{4},g}$ and the weighted standard perimeter $TV_g$. One can see that the result obtained
with the Manhattan perimeter with diagonal neighbors is quite competitive with the one
obtained with the standard perimeter; in particular, it is more isotropic, which is precisely what
is aimed at. Parameters are chosen from previous computations with classical snakes
(see [31]). The values are set in relation to the range of values of the optical flow
amplitude. For the weighted standard perimeter, the result is obtained in 0.24 or 0.25
second on all the images (of size 256 x 256) of the sequence. For the weighted Manhattan
perimeter involving diagonal neighbors, the time is 0.11, 0.12 or 0.13 second. The same
times are obtained with the weighted Manhattan perimeter, though it can go down to 0.09 or
0.10 second on some images. All of these times are measured with the clock() C function.

Figure 2 shows the results obtained by solving the ROF problem and thresholding
the function. We emphasize again that this is a major advantage over all previous
ways of solving this problem, since the function gives us all the solutions of the shape
optimization problems depending on $\alpha$. The segmentations shown in Figure 3 are indeed
obtained simply by thresholding the function at the levels indicated (0.5, 0.7 and 0.8).
As we had reasonable values of $\alpha$ from previous computations with classical snakes
([31]), we just tried a few values, but one could choose $\alpha$ in a more sophisticated way,
adapted to the histogram of the function solving the ROF model. Such a parameter
optimization could also be applied in the same way to a functional that was used by
Jehan-Besson, Barlaud and Aubert in [2] for video segmentation (actually, it
even inspired the work [31]):

$$J(\Omega) = \int_\Omega \alpha\,dx + \int_{D\setminus\Omega} |B - I|(x)\,dx + \lambda \int_{\partial\Omega} dS$$

where $B$ represents a background image and $I$ the current image in the movie. In the
discrete formalism which is used in this paper, it gives

$$\sum_i (\alpha - |B - I|_i)\,\theta_i + \lambda\,TV_1(\theta).$$

In this case it is a direct application of the previous work [11] (as before, we have to
change the perimeter to a Manhattan perimeter). The background can be computed
using more or less sophisticated methods. We tried a temporal median filter and the method
proposed by Kornprobst, Deriche and Aubert [3]. Some results are shown in Figure 4
for $\alpha = 10, 15, 20, 25$ and $\lambda = 50$.

Figure 1: Results obtained with graph cuts with the energy involving $TV_{1,g}$ (first image,
top left), $TV_{1,\frac{\pi}{4},g}$ (top right) and $TV_g$ (bottom). The initial data is the optical flow norm $|v|$.
Parameters are $\alpha = 0.6$, $\lambda = 0.2$ and $\mu = 10$.

Figure 2: Results obtained (10th image of the sequence) with $TV_{1,g}$ and $TV_{1,\frac{\pi}{4},g}$ and the
optical flow norm as initial data. Parameters are $\alpha = 0.6$, $\lambda = 0.2$ and $\mu = 10$. Notice the
smoothness of the result on the right image in comparison to the one on the left image.

Figures 5 and 6 give the computational time (measured in seconds with the clock() C
function) of the algorithm (using the model described in [31]) on the first ten images of
the sequence for the total variation minimisation algorithms (images are 256 x 256,
parameters are $\alpha = 0.6$, $\lambda = 0.2$ and $\mu = 10$). Iterations were stopped when the
maximum of the two residues, between $\xi^n$ and $\xi^{n+1}$ and between $\eta^n$ and $\eta^{n+1}$, became
lower than 0.002, a maximum of 2000 iterations being set to prevent the algorithm from
becoming too slow. The time step is $\tau = 0.1$. Such a value could be quite high, as we
have indicated that the time step should be lower than $\frac{1}{8 \max g}$, but a simple trick is to
write $g = g_{\max}\,\hat{g}$, which changes the regularization parameter from 1 to $\max g$ and
thus has no incidence on the time step condition, as the latter does not depend on
the regularization parameter. One could think that the precision value is too low and leads
to a rather heavy computational time; however, we have noticed that for a precision of
0.01, the result is not sufficiently good for level set extraction (see Figure 7, where a
result is displayed for precisions 0.01 and 0.002).

Figure 3: Influence of the $\alpha$ parameter. Results obtained (10th image of the sequence) with
total variation minimisation with $TV_{1,\frac{\pi}{4},g}$ and the optical flow norm as initial data. Parameters
are $\lambda = 0.2$ and $\mu = 10$. From left to right and top to bottom: $\alpha = 0.5$, 0.6, 0.7 and 0.8.

Figure 4: First image on the top: background image computed by time median filter. Results
obtained (10th image of the sequence) with total variation minimisation (Manhattan with
horizontal, vertical and diagonal neighbors, $TV_{1,\frac{\pi}{4},g}$) for the Jehan-Besson-Aubert-Barlaud
model (initial data: $|B - I|$). The parameter is $\lambda = 50$. From left to right and top to bottom:
$\alpha = 10, 15, 20, 25$.

Figure 5: Computational time of the TV regularization solving algorithm with $TV_{1,\frac{\pi}{4},g}$. The
residue is $r = \max(\|\xi^{n+1} - \xi^n\|_{l^2}, \|\eta^{n+1} - \eta^n\|_{l^2})$.

6 Moving objects segmentation by a contrario detection 

The method described here is inspired by previous works of Pelletier, Koepfler and
Dibos and of Caselles, Garrido and Igual [21]. The purpose is to decide whether a
pixel is meaningful or not. In our case, the data is the solution of the ROF problem
with the particular choice of the weighted total variation. The meaningfulness is
decided between two hypotheses: $H^0$, "there is motion", and $H^1$, "there is no motion".
The classical approach of hypothesis testing is to suppose
that $H^0$ is true and to look at the observations under this assumption. Another
approach (the a contrario model) consists in deciding under the assumption that $H^1$ is
true. This was introduced as a statistical method to provide a decision tool which
simulates the Gestalt laws. The basic principle (the Helmholtz principle) is based on
the fact that every large deviation from the noise should be perceptible and thus is
decided to be meaningful.

Figure 6: Computational time of the TV regularization solving algorithm with $TV_{1,g}$. The
residue is $r = \max(\|\xi^{n+1} - \xi^n\|_{l^2}, \|\eta^{n+1} - \eta^n\|_{l^2})$.

Figure 7: Results obtained (10th image of the sequence) with total variation minimisation
(Manhattan with horizontal, vertical and diagonal neighbors) for two different precisions:
0.002 (left image) and 0.01 (right image). Parameters are $\lambda = 0.2$ and $\mu = 10$.

Around a pixel $x$, we design a neighborhood $N(x)$ of size $N = n \times n$ and we define the
random variable

$$\mathcal{E}_x = \frac{1}{N} \sum_{y \in N(x)} \psi(|V(y)|)$$

where $\psi : \mathbb{R} \to [0, 1]$ is a function designed to renormalize the data between zero
and one. The pixels $\{y \in N(x)\}$ are assumed to be "independent" and $V$ denotes the
random variable associated to the solution of the ROF problem with the optical flow
amplitude as initial data.

Let $E_x$ be the observed value of $\mathcal{E}_x$. There is motion if $E_x$ is sufficiently high. Then,
under the assumption that $H^1$ is true, the rejection test is $[\mathcal{E}_x > \delta]$, $\delta > 0$. But we
do not compute the value of $\delta$ for a given level of meaningfulness, as is usually done
in hypothesis testing; we compute the probability $P[\mathcal{E}_x \ge E_x \mid H^1]$, which is the motion
probability for the observed value $E_x$. For its evaluation, we need Hoeffding's
Theorem [21], once we have estimated the mean of the random variable $\psi(|V(y)|)$
from the observed values.

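Theorem 3 (Hoeffding 1963) Let $Y_1, \dots, Y_N$ be independent variables with $\mu_i =
E(Y_i) \in (0, 1)$ and $P[0 \le Y_i \le 1] = 1$ for all $i = 1, \dots, N$. Let $\mu = \frac{\mu_1 + \dots + \mu_N}{N}$. Then, for
$0 < t < 1 - \mu$ and $\bar{Y} = \frac{Y_1 + \dots + Y_N}{N}$,

$$P[\bar{Y} - \mu \ge t] \le \exp(-N H(\mu + t, \mu))$$

where $H(x, y) = x \log\left(\frac{x}{y}\right) + (1 - x) \log\left(\frac{1 - x}{1 - y}\right)$.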
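The Hoeffding bound is cheap to evaluate, which is what makes the detection practical. The following sketch is an illustration only: it evaluates $H$ and the resulting upper bound $N_{tot} \exp(-N H(E_x, \mu))$ on the number of false alarms of a pixel, where $N_{tot}$ stands for the number of pixels of the image. The function names and the numerical values (a 3 x 3 window, a 256 x 256 image, $\mu = 0.1$) are assumptions chosen for the example.

```python
import math

def hoeffding_h(x, y):
    """H(x, y) = x log(x/y) + (1 - x) log((1 - x)/(1 - y)), 0 < y < 1, 0 <= x <= 1.
    Terms of the form 0 * log(0) are taken equal to 0."""
    h = 0.0
    if x > 0:
        h += x * math.log(x / y)
    if x < 1:
        h += (1 - x) * math.log((1 - x) / (1 - y))
    return h

def nfa_bound(e_x, mu, n_window, n_total):
    """Upper bound n_total * exp(-n_window * H(e_x, mu)), meant for mu < e_x < 1."""
    return n_total * math.exp(-n_window * hoeffding_h(e_x, mu))

def is_meaningful(e_x, mu, n_window, n_total, eps=1.0):
    """Reject 'no motion' when the bound on the number of false alarms is below eps."""
    return mu < e_x and nfa_bound(e_x, mu, n_window, n_total) < eps

n_window, n_total, mu = 9, 256 * 256, 0.1      # hypothetical setting
for e_x in (0.3, 0.9):
    print(e_x, nfa_bound(e_x, mu, n_window, n_total),
          is_meaningful(e_x, mu, n_window, n_total))
```

Testing the bound against `eps` is equivalent to comparing `hoeffding_h(e_x, mu)` with `math.log(n_total / eps) / n_window`, so a single threshold on $H$ can be precomputed for the whole image.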

Then we define the expected number of false alarms 

Definition 1 (NFA of a pixel) The number of false alarms is defined as: 

NFA{x) =NtotW[£x > E^H 1 ] 
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where Mtot is the total number of pixels in the image. 

The rejection of H 1 is decided if the NFA is lower than a parameter e. For the 
estimation of (i, we simply compute the empirical mean of E x over the entire image: 



for ft < E x < 1. 

The main reservation about applying this framework to the solution of the ROF
problem is that the independence of this quantity over a neighborhood is not
verified. However, we would like to emphasize that the dependency should exist only
along the parts of level lines included in the neighborhood: the TV regularization
does not smooth across the edges but along them. The second reason for using the
Hoeffding formula is that, in practice, we did not observe any problem when applying it.

A post-processing step is performed to take into account the fact that the detected
region is expected to slightly surround the true motion region, due to the neighborhood
constructed around each pixel. We simply erode the obtained mask with a radius of half
the neighborhood radius. Finally, we can compute the level set of the ROF problem
solution which has the minimal difference with the result of the a contrario
detection.
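
The post-processing step can be sketched as follows. This Python illustration rests on assumptions that the text does not fix: the erosion uses a square structuring element, pixels outside the image count as background, the "difference" between two regions is measured as the number of pixels on which they disagree, and level sets are taken as $\{u > \text{level}\}$. The function names are hypothetical.

```python
def erode(mask, radius):
    """Binary erosion of a 2D boolean mask by a square of the given radius.

    Pixels outside the image are treated as background.
    """
    rows, cols = len(mask), len(mask[0])
    out = [[False] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            out[i][j] = all(
                0 <= a < rows and 0 <= b < cols and mask[a][b]
                for a in range(i - radius, i + radius + 1)
                for b in range(j - radius, j + radius + 1))
    return out


def closest_level_set(u, mask, levels):
    """Among the candidate levels, return (level, {u > level}) closest to `mask`.

    Closeness is the number of pixels where the level set and the mask differ.
    """
    best = None
    for level in levels:
        level_set = [[value > level for value in row] for row in u]
        diff = sum(a != b
                   for row_a, row_b in zip(level_set, mask)
                   for a, b in zip(row_a, row_b))
        if best is None or diff < best[0]:
            best = (diff, level, level_set)
    return best[1], best[2]
```

With a neighborhood radius of 3, eroding with `radius = 3 // 2` gives the erosion radius of 1 used in the experiments.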

We present results in Figures 8 and 9. In Figure 8 (resp. 9), the observation is the
result of the ROF problem with weighted TV and the optical flow norm (resp. the
difference image $B - I$) as initial data. The left image shows the result of the a
contrario detection (plus erosion), and the right image shows the closest level set.
Note that the a contrario method is not able to discriminate between the two moving
cars in the image.




Using the Hoeffding formula, a sufficient condition of rejection is then

$$H(E_x, \mu) > \frac{1}{N} \log\left(\frac{N_{tot}}{\varepsilon}\right)$$

Figure 8: a contrario detection with the optical flow magnitude regularized by weighted TV.
The neighborhood radius is $N = 3$. The $\varepsilon$ parameter is set to one, as is usually done.
The left image is the raw result of the a contrario detection eroded with a radius of 1. The
right image is the level set of the ROF solution that best matches the a contrario detection
result.












Figure 9: a contrario detection with the difference image between the current image and the
background regularized by weighted TV. The neighborhood radius is $N = 3$. The $\varepsilon$
parameter is set to one, as is usually done. The left image is the raw result of the a
contrario detection eroded with a radius of 1. The right image is the level set of the ROF
solution that best matches the a contrario detection result.

7 Conclusion



In this paper, we have extended the main result of in order to handle shape 
optimization functionals involving weighted anisotropic perimeter. It states that all
the solutions of some shape optimization problems depending on a parameter $\alpha$ are
$\alpha$-level sets of the solution of the Rudin-Osher-Fatemi problem. Thus the algorithm
used for total variation regularization allows all the solutions for different values
of $\alpha$ to be computed in one pass. This is, in our opinion, the main advantage over
classical snake methods such as that of Chan and Vese for this particular type of shape
optimization.
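
The one-pass property can be illustrated with a short Python sketch: once the ROF solution `u` is available, each additional value of the parameter costs only one thresholding. The convention $\{u > \alpha\}$ and the function names are assumptions made for the illustration; the exact level and inequality depend on the normalization of the functional.

```python
def level_set(u, alpha):
    """Binary mask {u > alpha} of a 2D array given as a list of rows."""
    return [[value > alpha for value in row] for row in u]


def level_sets(u, alphas):
    """One thresholding per alpha; the ROF solution u is computed only once."""
    return {alpha: level_set(u, alpha) for alpha in alphas}


def is_nested(smaller, larger):
    """True if every pixel of `smaller` also belongs to `larger`."""
    return all(b or not a
               for row_a, row_b in zip(smaller, larger)
               for a, b in zip(row_a, row_b))
```

The resulting regions are nested: the level set for a larger value of `alpha` is contained in the level set for a smaller one.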

On the other hand, we have also minimized the discrete version of the functional with
graph cut techniques. The main advantage of this method is that it is very fast:
although graph cuts do not share the one-pass advantage of the TV minimization
algorithm, even a large number of runs of the algorithm leads to a very competitive
computation time (close to that of a single run of a classical continuous snake
algorithm).

We have used both methods on two video segmentation models: one introduced
in [3^ in which a weighted perimeter is involved, and an earlier one introduced by
Jehan-Besson, Barlaud and Aubert. We would like to emphasize that the general
model studied in the theoretical part of the paper covers many applications. One could
think, for example, of segmentation with shape priors, using a perimeter weighted
by a distance to the prior. Such models have been used by Freedman and Zhang [22],
or by Gastaud, Jehan-Besson, Barlaud and Aubert [2*5]...

We have also proposed to use an a contrario method for region finding, applied to the
result of the TV minimization process but without extracting a level set at a predefined
level. This method does not lead to the most satisfactory results, but it is very fast
since it does not require choosing a particular value of the parameter, and is thus
better suited to real-time applications.